Deaths by horse kicks

Ladislaus Josephovich Bortkiewicz was a Russian economist and statistician. He noted that the Poisson distribution can be very useful in applied statistics when describing low-frequency events in a large population. In a well-known example, he showed that the number of deaths by horse kicks in the Prussian army followed the Poisson distribution.

Consider the following two sets of observations, taken over a fixed large time interval in two different corps:

\(y\) dead soldiers 0 1 2 3 4 ≥ 5
\(n_1\) observations 109 65 22 3 1 0
\(n_2\) observations 144 91 32 11 2 0

The two datasets are independent, so we can safely join them. Another option could be to analyse one before the other, taking the posterior from the first as prior for the second: but since we’ll be using Gamma priors, the update rule of the prior parameters,

\[ \alpha_{\mathrm{post}} = \alpha_{\mathrm{prior}} + \sum_i y_i, \quad \lambda_{\mathrm{post}} = \alpha_{\mathrm{prior}} + n \]

would give the same result as for the case with the two datasets lumped together.

We’ll assume a uniform prior, and compute the posterior distribution for \(\lambda\), the death rate over the measurement time. We’ll do the same also with a Jeffrey’s prior.

y <- c(
    rep(0, 109), rep(1, 65), rep(2, 22), rep(3,  3), rep(4, 1),
    rep(0, 144), rep(1, 91), rep(2, 32), rep(3, 11), rep(4, 2)
)

# define a common function to return the Bayes stuff
# s = sum of observations
# n = number of observations
# pr_* = prior parameters
baypois <- function(s, n, pr_alpha, pr_lambda) {
    alpha <- pr_alpha + s
    lambda <- pr_lambda + n
    post <- function(x) dgamma(x, alpha, lambda)

    median <- qgamma(0.5, alpha, lambda)
    mean <- integrate(\(x) x * post(x), 0, Inf)$value
    std <- sqrt(integrate(\(x) (x - mean)^2 * post(x), 0, Inf)$value)
    cred <- sapply(c(0.025, 0.975), \(c) qgamma(c, alpha, lambda))

    list(post = post, med = median, mean = mean, std = std, cred = cred)
}

unif <- baypois(sum(y), length(y), 1, 0)
jeff <- baypois(sum(y), length(y), 0.5, 0)
Let’s put the results in a summary table and a plot.
Posterior parameters
Prior Median Mean St. dev. 95% cred. int.
Uniform 0.664 0.665 0.0372 0.594, 0.739
Jeffrey’s 0.663 0.664 0.0372 0.593, 0.738

The parameters of the two posteriors are almost identical, but we could have expected it given the high number of observations (480).

Since the distribution are basically overlapping, we’ll zoom on the region

my_pal <- wesanderson::wes_palette("Zissou1", 5)[c(1, 3, 5)]

posts <- tibble(
    lam = seq(0.5, 0.9, by = 0.0005),
    unif = unif$post(lam), jeff = jeff$post(lam)
) |>
    pivot_longer(-lam, names_to = "dist", values_to = "prob") |>
    mutate(dist = fct_relevel(dist, "unif", "jeff"))

modes <- posts |>
    group_by(dist) |>
    summarize(mode = lam[which.max(prob)], pmax = max(prob))

creds <- posts |>
    group_by(dist) |>
    filter(
        lam > lam[which.max(cumsum(prob) * 0.0005 > 0.025)] &
        lam < lam[which.max(cumsum(prob) * 0.0005 > 0.975)]
    )

posts |>
    ggplot(aes(x = lam, y = prob)) +
        geom_line(linewidth = 0.8, colour = my_pal[1]) +
        geom_area(
            data = creds,
            fill = my_pal[1], alpha = 0.5,
            position = "identity", show.legend = FALSE
        ) +
        geom_segment(
            aes(x = mode, y = 0, xend = mode, yend = pmax),
            data = modes, colour = my_pal[3], linewidth = 0.8
        ) +
        scale_colour_manual(
            values = my_pal[c(1, 3)], labels = c("Uniform", "Jeffrey’s")
        ) +
        geom_label(
            aes(
                x = mode, y = 0,
                label = paste("Mode =", format(mode, digits = 3))
            ),
            data = modes, hjust = -0.1, vjust = -0.3, colour = my_pal[3]
        ) +
        labs(
            x = "Poisson parameter λ",
            y = "Posterior probability P(λ | y)",
        ) +
        facet_wrap(
            vars(dist), nrow = 2,
            labeller = as_labeller(c(
                unif = "With uniform prior",
                jeff = "With Jeffrey’s prior"
            ))
        )

Deaths by horse kicks, but with JAGS

Now we’ll repeat the previous analysis, but this time using a Markov Chain Monte Carlo sampling. We’ll rely on the JAGS library – and its R interface rjags – to build the chain and generate samples with the Gibbs method.

library(rjags)

model <- "model {
    # data likelihood
    for (i in 1:length(x)) {
        x[i] ~ dpois(lambda)
    }

    # uniform prior for lambda
    lambda ~ dexp(0.0001)

    # predicted data, given lambda
    y ~ dpois(lambda)
}"

data <- list(x = c(
    rep(0, 109), rep(1, 65), rep(2, 22), rep(3,  3), rep(4, 1),
    rep(0, 144), rep(1, 91), rep(2, 32), rep(3, 11), rep(4, 2)
))

# create the model
jm <- jags.model(textConnection(model), data)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 480
##    Unobserved stochastic nodes: 2
##    Total graph size: 483
## 
## Initializing model
# update the Markov chain (burn-in)
update(jm, 1000)

chain <- coda.samples(jm, c("lambda", "y"), n.iter = 1e5)

To make sure the sampling went smoothly, we can calculate the auto-correlation function and let coda compute the effective size of the chain – that is, the number of sufficiently uncorrelated samples in the chain.

acf_lam <- acf(as.mcmc(chain)[, 1], plot = FALSE)
acf_y <- acf(as.mcmc(chain)[, 2], plot = FALSE)

tibble(lag = acf_y$lag[-1], lambda = acf_lam$acf[-1], y = acf_y$acf[-1]) |>
    pivot_longer(-lag, names_to = "var", values_to = "acf") |>
    ggplot(aes(x = lag, y = acf, colour = var)) +
        geom_point(size = 1.8) +
        geom_line(aes(linetype = var), linewidth = 0.6) +
        scale_colour_manual(values = my_pal[c(1, 3)], labels = c("λ", "y")) +
        scale_linetype_manual(values = c(1, 2), labels = c("λ", "y")) +
        labs(
            x = "Lag", y = "ACF",
            colour = "Variable", linetype = "Variable"
        ) +
        coord_cartesian(ylim = c(-0.025, 0.025))

The ACF looks good both for \(\lambda\) and for \(y\), and indeed the effective sizes are…

effectiveSize(chain)
##   lambda        y 
## 101789.6 102253.1

… which is precisely the chain’s size, so we didn’t “waste” iterations.

If we look at the posterior density, we get indeed something compatible with the previous results. Let’s check all the parameters.

(chain_summ <- summary(chain))
## 
## Iterations = 1001:101000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 1e+05 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##          Mean      SD Naive SE Time-series SE
## lambda 0.6645 0.03731 0.000118      0.0001169
## y      0.6652 0.81741 0.002585      0.0025562
## 
## 2. Quantiles for each variable:
## 
##          2.5%    25%    50%    75%  97.5%
## lambda 0.5935 0.6389 0.6637 0.6892 0.7398
## y      0.0000 0.0000 0.0000 1.0000 3.0000

And the previous results for comparison.

Posterior parameters
Prior Median Mean St. dev. 95% cred. int.
Uniform 0.664 0.665 0.0372 0.594, 0.739
Jeffrey’s 0.663 0.664 0.0372 0.593, 0.738

Let’s plot now the posterior densities for \(\lambda\) and the death count \(y\).

chain_df <- as.data.frame(as.mcmc(chain))

# get means and credibility intervals from the chain statistics
means <- c(
    lambda = chain_summ$statistics[1, 1],
    y = chain_summ$statistics[2, 1]
)
creds <- c(
    lambda = unname(chain_summ$quantiles[1, c(1, 5)]),
    y = unname(chain_summ$quantiles[2, c(1, 5)])
)

ggplot(chain_df, aes(lambda)) +
    geom_density(
        linewidth = 0.8, colour = my_pal[1], fill = my_pal[1], alpha = 0.5
    ) +
    geom_vline(xintercept = means["lambda"], colour = my_pal[3]) +
    annotate(
        "label", x = means["lambda"], y = 1.5, colour = my_pal[3],
        label = paste("Mean =", format(means["lambda"], digits = 3))
    ) +
    geom_vline(
        xintercept = creds["lambda1"], colour = my_pal[3],
        linetype = "dashed"
    ) +
    annotate(
        "label", x = creds["lambda1"], y = 1.5, colour = my_pal[3],
        label = paste(
            "2.5% quantile =", format(creds["lambda1"], digits = 3)
        )
    ) +
    geom_vline(
        xintercept = creds["lambda2"], colour = my_pal[3],
        linetype = "dashed"
    ) +
    annotate(
        "label", x = creds["lambda2"], y = 1.5, colour = my_pal[3],
        label = paste(
            "97.5% quantile =", format(creds["lambda2"], digits = 3)
        )
    ) +
    labs(x = "Poisson parameter λ", y = "Sampled posterior PDF") +
    scale_x_continuous(breaks = scales::pretty_breaks()) +
    scale_y_continuous(breaks = scales::pretty_breaks())

ggplot(chain_df) +
    geom_histogram(
        aes(y, after_stat(density)), boundary = 0,
        binwidth = 1, fill = my_pal[1]
    ) +
    geom_vline(xintercept = means["y"], colour = my_pal[3]) +
    annotate(
        "label", x = means["y"], y = 0.4, colour = my_pal[3],
        label = paste("Mean =", format(means["y"], digits = 3))
    ) +
    labs(x = "Death count y", y = "Predicted PDF") +
    scale_y_continuous(breaks = scales::pretty_breaks())

Bacteria in water streams

A study on water quality of streams determined that a high level of bacterium X was defined as a concentration exceeding 100 per 100 ml of stream water. A total of \(n = 116\) samples were collected from streams with a significant environmental impact on pandas. Out of these samples, \(y = 11\) exhibited a high bacterium X level.

First, let’s find the frequentist estimator for the probability \(p\) that a sample of water taken from the stream has a high concentration of the bacterium. This is simply

\[ \hat{p} = \frac{y}{n} = \frac{11}{116} = 0.0948, \]

with standard deviation

\[ \operatorname{SD}[\hat{p}] = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} = 0.0272 \]

Now, let’s turn to Bayesian analysis. We’ll use a \(\operatorname{Beta}(1, 10)\) prior for \(p\), and find the posterior distribution with its mean, variance and 95% credibility interval.

y <- 11
n <- 116

# posterior is a beta with alpha = alpha_p + y, beta = beta_p + n - y
post <- function(p) dbeta(p, 1 + y, 10 + n - y)

dp <- 0.001
p <- seq(0, 1, by = dp)

mode <- p[which.max(post(p))]
mean <- integrate(\(p) p * post(p), 0, 1)$value
std <- sqrt(integrate(\(p) (p - mean)^2 * post(p), 0, 1)$value)
cred <- qbeta(c(0.025, 0.975), 1 + y, 10 + n - y)
Posterior parameters
Prior Mode Mean St. dev. 95% cred. int.
Beta(1, 10) 0.088 0.0945 0.0259 0.0502, 0.1508
data.frame(p = p[p < 0.5], fp = post(p[p < 0.5])) |>
    ggplot(aes(x = p, y = fp)) +
        geom_line(
            aes(linetype = "post"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_line(
            aes(y = dbeta(p, 1, 10), linetype = "prior"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_area(
            data = \(df) df |> filter(p > cred[1] & p < cred[2]),
            fill = my_pal[1], alpha = 0.5
        ) +
        geom_segment(
            aes(x = mode, xend = mode, y = 0, yend = post(mode)),
            linewidth = 0.6, colour = my_pal[3]
        ) +
        annotate(
            "label", x = mode, y = 2.5,
            label = paste("Mode =", format(mode, digits = 3)),
            hjust = 1.05, colour = my_pal[3]
        ) +
        geom_segment(
            aes(x = mean, xend = mean, y = 0, yend = post(mean)),
            linewidth = 0.6, colour = my_pal[2]
        ) +
        annotate(
            "label", x = mean, y = 5,
            label = paste("Mean =", format(mean, digits = 3)),
            hjust = -0.05, colour = my_pal[2]
        ) +
        scale_linetype_discrete(
            labels = c("Posterior", "Beta(1, 10) prior")
        ) +
        labs(
            x = "Binomial parameter p", y = "PDF for p",
            linetype = element_blank()
        )

We can now test the hypothesis

\[ \text{$H_0$: } p = 0.1 \text{ versus $H_1$: } p \neq 0.1 \]

at a 5% level of significance, first in the frequentist way and then in the Bayesian one.

In the frequentist approach, the null distribution is a binomial with \(n = 116\) and \(p = 0.1\). We have to check whether \(y = 11\) lies above or below the level of significance.

x <- 1:30
nullpdf <- dbinom(x, n, 0.1)
nullcdf <- pbinom(x, n, 0.1)

# get interval ends
sig1 <- x[max(which(nullcdf < 0.025))]
sig2 <- x[min(which(nullcdf > 0.975))]

Given the discrete nature of the distribution, we’re actually looking at a 4.3% significance level, but it’s the closest option to 5% with the current distribution.

tibble(
    data = x, prob = nullpdf,
    flag = ifelse(x == y, "obs", ifelse(sig1 < x & x < sig2, "acc", "rej"))
) |>
    ggplot(aes(x = data, y = prob)) +
        geom_col(aes(fill = flag), show.legend = FALSE) +
        scale_fill_manual(
            values = c(acc = my_pal[1], obs = my_pal[2], rej = my_pal[3]),
        ) +
        geom_vline(
            xintercept = c(sig1 + 0.5, sig2 - 0.5), colour = my_pal[3]
        ) +
        annotate(
            "label", label = "← Rej. H₀", colour = my_pal[3],
            x = sig1 + 0.5, y = 0.05, hjust = 1.1
        ) +
        annotate(
            "label", label = "Rej. H₀ →", colour = my_pal[3],
            x = sig2 - 0.5, y = 0.05, hjust = -0.1
        ) +
        labs(
            x = "Contaminated samples y",
            y = paste0("Null distribution Binom(y | n = ", n, ", p = 0.1)"),
        ) +
        coord_cartesian(expand = FALSE, ylim = c(0, 0.13))

So, the observed count of \(y = 11\) is well within the acceptance region of the null hypothesis, so we have to reject the alternative hypothesis that \(p \neq 0.1\) at a 5% level of significance.

Let’s do it in the Bayesian way now. We have to find the credibility interval associated to the chosen significance level, and check whether \(p_0 = 0.1\), the binomial probability of the null hypothesis, lies inside or outside the interval.

p <- seq(0, 0.25, length.out = 1000)
data.frame(
    prob = p, post = post(p),
    flag = ifelse(cred[1] < p & p < cred[2], "acc", "rej")
) |>
    ggplot(aes(x = prob, y = post)) +
        geom_line(linewidth = 0.8, colour = my_pal[1]) +
        # fill acceptance region
        geom_area(
            aes(y = ifelse(flag == "acc", post, 0)),
            fill = my_pal[1], alpha = 0.5
        ) +
        # fill rejection region
        geom_area(
            aes(y = ifelse(flag == "rej", post, 0)),
            fill = my_pal[3], alpha = 0.5
        ) +
        geom_segment(
            aes(x = 0.1, xend = 0.1, y = 0, yend = post(0.1)),
            colour = my_pal[2]
        ) +
        annotate(
            "label", label = "p₀ = 0.1", x = 0.1, y = 5,
            hjust = 1.1, colour = my_pal[2]
        ) +
        geom_vline(xintercept = cred, colour = my_pal[3]) +
        annotate(
            "label", label = "← Rej. H₀", colour = my_pal[3],
            x = cred[1], y = 5, hjust = 1.1
        ) +
        annotate(
            "label", label = "Rej. H₀ →", colour = my_pal[3],
            x = cred[2], y = 5, hjust = -0.1
        ) +
        scale_fill_manual(values = c(acc = my_pal[1], rej = my_pal[3])) +
        labs(
            x = "Binomial parameter p",
            y = "Posterior probability P(p | y, n)"
        ) +
        coord_cartesian(expand = FALSE, ylim = c(0, 16.5))

As we could expect, the Bayesian test rejects the alternative hypothesis too. So according to both frameworks we can’t say that \(p \neq 0.1\) at a 5% level of significance. Indeed, we got pretty similar values in both the frequentist estimator and the mean/mode of the posterior distribution.

Now, let’s consider a new measurement, performed one month later on \(n' = 165\) water samples, which saw \(y' = 9\) samples with high concentration of the bacterium. Assuming that the probability of contamination might not be the same as one month ago, a frequentist would calculate a new value for the estimator of \(p\) as

\[ \hat{p}' = \frac{y'}{n'} = \frac{9}{165} = 0.0545, \]

with standard deviation

\[ \operatorname{SD}[\hat{p}'] = \sqrt{\frac{\hat{p}'(1 - \hat{p}')}{n'}} = 0.0177. \]

Now, let’s perform the same hypothesis test as before.

y <- 9
n <- 165

x <- 1:35
nullpdf <- dbinom(x, n, 0.1)
nullcdf <- pbinom(x, n, 0.1)

# get interval ends
sig1 <- x[max(which(nullcdf < 0.025))]
sig2 <- x[min(which(nullcdf > 0.975))]

tibble(
    data = x, prob = nullpdf,
    flag = ifelse(x == y, "obs", ifelse(sig1 < x & x < sig2, "acc", "rej"))
) |>
    ggplot(aes(x = data, y = prob)) +
        geom_col(aes(fill = flag), show.legend = FALSE) +
        scale_fill_manual(
            values = c(acc = my_pal[1], obs = my_pal[2], rej = my_pal[3]),
        ) +
        geom_vline(
            xintercept = c(sig1 + 0.5, sig2 - 0.5), colour = my_pal[3]
        ) +
        annotate(
            "label", label = "← Rej. H₀", colour = my_pal[3],
            x = sig1 + 0.5, y = 0.05, hjust = 1.1
        ) +
        annotate(
            "label", label = "Rej. H₀ →", colour = my_pal[3],
            x = sig2 - 0.5, y = 0.05, hjust = -0.1
        ) +
        labs(
            x = "Contaminated samples y",
            y = paste0("Null distribution Binom(y | n = ", n, ", p = 0.1)"),
        ) +
        coord_cartesian(expand = FALSE, ylim = c(0, 0.13))

The observed count of \(y' = 9\) contaminated samples is still within the acceptance region of the null hypothesis, so we can’t reject \(p = 0.1\) at the chosen significance level of 5%.

Let’s conduct the Bayesian analysis now. We can begin by using the same prior as before, assuming the indipendence of the two measurements.

alpha_pr <- 1
beta_pr <- 10

post <- function(p) dbeta(p, alpha_pr + y, beta_pr + n - y)

dp <- 0.001
p <- seq(0, 1, by = dp)

mode <- p[which.max(post(p))]
mean <- integrate(\(p) p * post(p), 0, 1)$value
std <- sqrt(integrate(\(p) (p - mean)^2 * post(p), 0, 1)$value)
cred <- qbeta(c(0.025, 0.975), alpha_pr + y, beta_pr + n - y)
Posterior parameters
Prior Mode Mean St. dev. 95% cred. int.
Beta(1, 10) 0.052 0.0568 0.0174 0.0277, 0.0954
data.frame(p = p[p < 0.3], fp = post(p[p < 0.3])) |>
    ggplot(aes(x = p, y = fp)) +
        geom_line(
            aes(linetype = "post"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_line(
            aes(y = dbeta(p, alpha_pr, beta_pr), linetype = "prior"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_area(
            data = \(df) df |> filter(p > cred[1] & p < cred[2]),
            fill = my_pal[1], alpha = 0.5
        ) +
        geom_segment(
            aes(x = mode, xend = mode, y = 0, yend = post(mode)),
            linewidth = 0.6, colour = my_pal[3]
        ) +
        annotate(
            "label", x = mode, y = 2.5,
            label = paste("Mode =", format(mode, digits = 3)),
            hjust = 1.05, colour = my_pal[3]
        ) +
        geom_segment(
            aes(x = mean, xend = mean, y = 0, yend = post(mean)),
            linewidth = 0.6, colour = my_pal[2]
        ) +
        annotate(
            "label", x = mean, y = 5,
            label = paste("Mean =", format(mean, digits = 3)),
            hjust = -0.05, colour = my_pal[2]
        ) +
        scale_linetype_discrete(
            labels = c(
                "Posterior",
                paste0("Beta(", alpha_pr, ", ", beta_pr, ") prior")
            )
        ) +
        labs(
            x = "Binomial parameter p", y = "PDF for p",
            linetype = element_blank()
        )

Then, let’s test the alternative hypothesis \(p \neq 0.1\).

p <- seq(0, 0.2, length.out = 1000)
data.frame(
    prob = p, post = post(p),
    flag = ifelse(cred[1] < p & p < cred[2], "acc", "rej")
) |>
    ggplot(aes(x = prob, y = post)) +
        geom_line(linewidth = 0.8, colour = my_pal[1]) +
        # fill acceptance region
        geom_area(
            aes(y = ifelse(flag == "acc", post, 0)),
            fill = my_pal[1], alpha = 0.5
        ) +
        # fill rejection region
        geom_area(
            aes(y = ifelse(flag == "rej", post, 0)),
            fill = my_pal[3], alpha = 0.5
        ) +
        geom_segment(
            aes(x = 0.1, xend = 0.1, y = 0, yend = post(0.1)),
            colour = my_pal[2]
        ) +
        annotate(
            "label", label = "p₀ = 0.1", x = 0.1, y = 2.5,
            hjust = -0.1, colour = my_pal[2]
        ) +
        geom_vline(xintercept = cred, colour = my_pal[3]) +
        annotate(
            "label", label = "← Rej. H₀", colour = my_pal[3],
            x = cred[1], y = 10, hjust = 1.1
        ) +
        annotate(
            "label", label = "Rej. H₀ →", colour = my_pal[3],
            x = cred[2], y = 10, hjust = -0.1
        ) +
        scale_fill_manual(values = c(acc = my_pal[1], rej = my_pal[3])) +
        labs(
            x = "Binomial parameter p",
            y = "Posterior probability P(p | y, n)"
        ) +
        coord_cartesian(expand = FALSE, ylim = c(0, 30))

This time \(p_0 = 0.1\) lies outside the acceptance region of the null hypothesis, so we can accept the alternative hypothesis and say that \(p \neq 0.1\) at a 5% level of significance.

Let’s see if this changes with a different choice of prior. In particular, let’s try incorporating the information from the older measurement, using the posterior we obtained in that case as the new prior.

alpha_pr <- 1 + 11
beta_pr <- 10 + 116 - 11

post <- function(p) dbeta(p, alpha_pr + y, beta_pr + n - y)

mode <- p[which.max(post(p))]
mean <- integrate(\(p) p * post(p), 0, 1)$value
std <- sqrt(integrate(\(p) (p - mean)^2 * post(p), 0, 1)$value)
cred <- qbeta(c(0.025, 0.975), alpha_pr + y, beta_pr + n - y)
Posterior parameters
Prior Mode Mean St. dev. 95% cred. int.
Beta(12, 115) 0.0689 0.0719 0.0151 0.0452, 0.1042

This time the mean and mode are a bit far from the frequentist estimator, but this is not surprising since we’re weighing more the values around \(p = 0.95\) with this prior choice.

data.frame(p = p[p < 0.3], fp = post(p[p < 0.3])) |>
    ggplot(aes(x = p, y = fp)) +
        geom_line(
            aes(linetype = "post"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_line(
            aes(y = dbeta(p, alpha_pr, beta_pr), linetype = "prior"),
            linewidth = 0.8, colour = my_pal[1]
        ) +
        geom_area(
            data = \(df) df |> filter(p > cred[1] & p < cred[2]),
            fill = my_pal[1], alpha = 0.5
        ) +
        geom_segment(
            aes(x = mode, xend = mode, y = 0, yend = post(mode)),
            linewidth = 0.6, colour = my_pal[3]
        ) +
        annotate(
            "label", x = mode, y = 2.5,
            label = paste("Mode =", format(mode, digits = 3)),
            hjust = 1.05, colour = my_pal[3]
        ) +
        geom_segment(
            aes(x = mean, xend = mean, y = 0, yend = post(mean)),
            linewidth = 0.6, colour = my_pal[2]
        ) +
        annotate(
            "label", x = mean, y = 5,
            label = paste("Mean =", format(mean, digits = 3)),
            hjust = -0.05, colour = my_pal[2]
        ) +
        scale_linetype_discrete(
            labels = c(
                "Posterior",
                paste0("Beta(", alpha_pr, ", ", beta_pr, ") prior")
            )
        ) +
        labs(
            x = "Binomial parameter p", y = "PDF for p",
            linetype = element_blank()
        )

So, let’s test the hypothesis \(p \neq 0.1\) as before.

p <- seq(0, 0.2, length.out = 1000)
data.frame(
    prob = p, post = post(p),
    flag = ifelse(cred[1] < p & p < cred[2], "acc", "rej")
) |>
    ggplot(aes(x = prob, y = post)) +
        geom_line(linewidth = 0.8, colour = my_pal[1]) +
        # fill acceptance region
        geom_area(
            aes(y = ifelse(flag == "acc", post, 0)),
            fill = my_pal[1], alpha = 0.5
        ) +
        # fill rejection region
        geom_area(
            aes(y = ifelse(flag == "rej", post, 0)),
            fill = my_pal[3], alpha = 0.5
        ) +
        geom_segment(
            aes(x = 0.1, xend = 0.1, y = 0, yend = post(0.1)),
            colour = my_pal[2]
        ) +
        annotate(
            "label", label = "p₀ = 0.1", x = 0.1, y = 2.5,
            hjust = 1.1, colour = my_pal[2]
        ) +
        geom_vline(xintercept = cred, colour = my_pal[3]) +
        annotate(
            "label", label = "← Rej. H₀", colour = my_pal[3],
            x = cred[1], y = 10, hjust = 1.1
        ) +
        annotate(
            "label", label = "Rej. H₀ →", colour = my_pal[3],
            x = cred[2], y = 10, hjust = -0.1
        ) +
        scale_fill_manual(values = c(acc = my_pal[1], rej = my_pal[3])) +
        labs(
            x = "Binomial parameter p",
            y = "Posterior probability P(p | y, n)"
        ) +
        coord_cartesian(expand = FALSE, ylim = c(0, 30))

And here we can’t reject the null hypothesis, since \(p_0\) lies inside the acceptance region of \(H_0\). So, to agree with the frequentist result we had to incorporate the information about the previous measurement, or at least we should have chosen a prior that weighed less the lower probabilities with respect to \(\operatorname{Beta}(1, 10)\).